Tiling on systems with communication/computation overlap

Authors

  • Pierre-Yves Calland
  • Jack J. Dongarra
  • Yves Robert
Abstract

In the framework of fully permutable loops, tiling is a compiler technique (also known as ‘loop blocking’) that has been extensively studied as a source-to-source program transformation. Little work has been devoted to the mapping and scheduling of the tiles on to physical parallel processors. We present several new results in the context of limited computational resources and assuming communication–computation overlap. In particular, under some reasonable assumptions, we derive the optimal mapping and scheduling of tiles to physical processors. Copyright © 1999 John Wiley & Sons, Ltd.

Similar resources

Tiling with limited resources

In the framework of perfect loop nests with uniform dependences, tiling has been extensively studied as a source-to-source program transformation. Little work has been devoted to the mapping and scheduling of the tiles on to physical processors. We present several new results in the context of limited computational resources, and assuming communication-computation overlap. In particular, under ...

Delivering High Performance to Parallel Applications Using Advanced Scheduling

This paper presents a complete framework for the parallelization of nested loops by applying tiling transformation and automatically generating MPI code allowing for an advanced scheduling scheme. In particular, under advanced scheduling scheme we consider two separate techniques: first, the application of a suitable tiling transformation, and second the overlapping of computation and communica...

Communication Scheduling as a First-Class Citizen in Distributed Machine Learning Systems

State-of-the-art machine learning systems rely on graph-based models, with the distributed training of these models being the norm in AI-powered production pipelines. The performance of these communication-heavy systems depends on the effective overlap of communication and computation. While the overlap challenge has been addressed in systems with simpler model representations, it remains an ope...

Optimizing Metacomputing with Communication-Computation Overlap

In the framework of distributed object systems, this paper presents the concepts and an implementation of a mechanism for overlapping communication and computation. This mechanism makes it possible to decrease the execution time of a remote method invocation with large parameters. Its implementation and related experiments in the C++// language running on top of Globus and Nexus are described.

Journal title:
  • Concurrency - Practice and Experience

Volume 11

Publication date: 1999